DOCUMENT RESUME

ED 126 107                                        TM 005 362

AUTHOR        Smith, Donald M.
TITLE         The KR-20 Reliability Coefficient as a Special Case
              of a More General Formula.
PUB DATE      [Apr 76]
NOTE          ISp.; Paper presented at the Annual Meeting of the
              American Educational Research Association (60th, San
              Francisco, California, April 19-23, 1976)

EDRS PRICE    MF-$0.83 HC-$1.67 Plus Postage.
DESCRIPTORS   Comparative Analysis; Grade Point Average; Predictive
              Ability (Testing); Predictive Validity; Response
              Style (Tests); Scoring; *Statistical Analysis; Test
              Interpretation; *Test Reliability; *Timed Tests;
              *True Scores
IDENTIFIERS   *Kuder Richardson Formula 20; Speeded Tests; Test
              Theory

ABSTRACT
     The Kuder Richardson-20 formula is shown to be a special case,
where each examinee is given sufficient time to answer each item, of a
more general formula where each examinee may not be allowed the
necessary time. The formula is extended to allow two scores, knowledge
and speed, to be extracted from each examinee's test score. Using a
sample of 82 first quarter freshmen it was found that, compared to the
simple total score, the two extracted scores gave better prediction of
grade-point-average (gpa) in quantitative areas and were equally
effective in predicting gpa in nonquantitative areas. (Author/D2F)



Documents acquired by ERIC include many informal unpublished materials
not available from other sources. ERIC makes every effort to obtain the
best copy available. Nevertheless, items of marginal reproducibility are
often encountered and this affects the quality of the microfiche and
hardcopy reproductions ERIC makes available via the ERIC Document
Reproduction Service (EDRS). EDRS is not responsible for the quality of
the original document. Reproductions supplied by EDRS are the best that
can be made from the original.






THE KR-20 RELIABILITY COEFFICIENT AS A SPECIAL
CASE OF A MORE GENERAL FORMULA



Donald Smith
Ball State University






A paper presented at the annual convention of The American Educational
Research Association, held in San Francisco, California, April 19th -
April 23rd, 1976.






Introduction



One of the basic assumptions of classical test theory is that each subject
has had ample time to attempt every item in a test. Under this assumption
an omitted item represents a true lack of knowledge and cannot be attributed
to a failure to reach the item. It is obvious that under a restrictive time
limit, defined as a period of time such that not every testee has sufficient
time to attempt every test item, this assumption cannot be met. Several persons
(Gulliksen, 1950; Cronbach & Warrington, 1951; Helmstadter & Ortmeyer, 1953)
have devised formulae that allow estimates to be made of the "speededness" of
a test.




With the exception of one formula proposed by Cronbach & Warrington,
these formulae are based on a single administration of one form of the test
and involve the comparison of a measure of the variance of omitted items with
either the total test variance or the total "error" (usually defined as the
sum of omitted and incorrectly answered items) variance. Although the various
approaches to the calculation of the value may vary somewhat, the basic
rationale underlying the formulae appears to be the same. Empirical support
for the fact that they provide similar results is reported by Helmstadter &
Ortmeyer in their previously cited article. Tau, an index proposed by Cronbach
& Warrington, requires that parallel forms of the test be administered under
timed and untimed conditions. The correlations between the four obtained
scores are then used to determine the proportion of the observed test variance
that can be attributed to speed.



Insert Formula 1 About Here



Although the theoretical limits of this index would approach negative and
positive infinity (assuming that the correlation between the parallel forms
given under the same condition were zero), the practical limits are likely to
be between -3.00 (in those cases where the two cross-condition correlations
are both 1.00 and the two same-condition correlations are both 0.50) and zero
(when all four correlations are the same value). This ability to have negative
values is a desirable characteristic that is not possessed by any of the other
proposed indices. All of the other formulae provide estimates of the proportion
of test variance that may be attributable to "speed" that have a lower limit of
zero; and, although this is never specifically stated, assume that the sole
effect of a restrictive time limit is to increase the total test variance: and
hence, by definition, the estimated reliability of the test. While this is
certainly a likely outcome it does not automatically follow that it will always
be so.
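
The limits described above can be checked numerically. The sketch below assumes the form of tau usually attributed to Cronbach & Warrington (1951): one minus the ratio of the product of the two cross-condition correlations to the product of the two same-condition correlations. Formula 1 is not reproduced in this copy, so the function name and argument names are illustrative assumptions only.

```python
def tau(r_cross_a, r_cross_b, r_same_timed, r_same_untimed):
    """Speededness index, ASSUMING tau = 1 - (cross product) / (same-condition product)."""
    denominator = r_same_timed * r_same_untimed
    if denominator == 0:
        # The index is unbounded when a same-condition correlation is zero.
        raise ZeroDivisionError("same-condition correlation of zero")
    return 1.0 - (r_cross_a * r_cross_b) / denominator

# Cross-condition correlations of 1.00, same-condition correlations of 0.50
lower = tau(1.0, 1.0, 0.5, 0.5)
# All four correlations equal
equal = tau(0.6, 0.6, 0.6, 0.6)
```

Under this assumed form, `lower` is -3.00 and `equal` is zero, which agrees with the practical limits stated in the text.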

The actual effect of a restrictive time limit on the test statistics will
vary, depending upon the statistic being considered, the degree of speededness,
and the characteristics of the group taking the test. Nunnally (1967, p. 556)
provides a good summary of these effects.

     "The potential effects of restrictive time limits on the mean
     score are obvious. If there are any effects at all, the expec-
     tation is that the mean will increase with increasing fractions of
     the comfortable-time, with little increase being expected above
     100 percent of the comfortable-time. There is, however, no
     strict relationship between the mean and reliability or validity.
     A mean near the center of the usable score range tends to favor
     high reliability, but the relationship holds in only a loose sta-
     tistical way."

And Morrison (1960) points out the problems associated with indices that are
based on a single administration of one form,

     "The major difficulty faced by all single-trial indices is
     that any score which might be used (right, wrong, number
     attempted, number right divided by the number attempted) is
     psychologically complex, in that it may reflect both speed and
     ability influences under time-limit administration. We simply
     cannot tell from time-limit data alone what the effects of the
     time-limit will be."

Table 1 illustrates what may occur to certain test statistics when a
restrictive time limit is employed. To construct the table two testee
characteristics were considered. The first of these was ability, defined as the
proportion of items that a person could correctly answer under untimed
conditions. Two ability levels, 0.9 and 0.4, were used. The second
characteristic was the speed at which a testee could work and this was defined
as the proportion of questions that he could answer in the allowed time. There
were also two levels, 0.9 and 0.4, of this characteristic. Twenty-three
different distributions of testees were considered to have taken a 100 item
test. The table gives these distributions and the expected values of three test
statistics under both timed and untimed administrations. It should be noted
that the untimed administrations give results that may be considered as
representing the "true" state of affairs.
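
The construction just described can be sketched in a few lines. The sketch assumes that a testee's expected timed score is 100 x ability x speed (he reaches the proportion `speed` of the items and answers the proportion `ability` of those correctly), that the untimed score is 100 x ability, and that variances are population variances weighted by the group proportions. The function names are illustrative and do not come from the paper.

```python
ITEMS = 100
# (ability, speed) for the four testee types: A=.9/S=.9, A=.9/S=.4, A=.4/S=.9, A=.4/S=.4
TYPES = [(0.9, 0.9), (0.9, 0.4), (0.4, 0.9), (0.4, 0.4)]

def kr21(mean, variance, k=ITEMS):
    """Kuder-Richardson formula 21; returns None when the variance is (near) zero."""
    if variance < 1e-9:
        return None
    return (k / (k - 1)) * (1 - mean * (k - mean) / (k * variance))

def expected_stats(proportions, timed=True):
    """Mean, variance and KR21 of the expected scores for a mixture of testee types."""
    scores = [ITEMS * a * (s if timed else 1.0) for a, s in TYPES]
    mean = sum(p * x for p, x in zip(proportions, scores))
    variance = sum(p * (x - mean) ** 2 for p, x in zip(proportions, scores))
    return mean, variance, kr21(mean, variance)

# A sample made up of equal proportions of the four testee types
timed = expected_stats([0.25, 0.25, 0.25, 0.25], timed=True)
untimed = expected_stats([0.25, 0.25, 0.25, 0.25], timed=False)
```

For the equal-proportions sample this gives a timed mean of 42.25 with variance 567.19 and KR21 of .967, against untimed values of 65.00, 625.00 and .973.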



Insert Table 1 About Here



In all cases the obtained mean is a low estimate of the value that would be
obtained under pure power conditions; and, as the correlation of 0.683 between
the means obtained under timed and untimed conditions shows, the relationship
is not such as would allow much confidence to be placed on the values obtained
under timed conditions. Much the same is true, albeit to a greater extent,
with the estimates of total test variance. The correlation between the two
sets of variances is only 0.189 and the timed values may be either more or less
than the untimed values. It is this fact that makes Cronbach's proposed index,
tau, so attractive. For it is the only index that allows for the possibility
that the total test variance may be reduced when the test is given with less
than a comfortable time limit.



Purpose of a Test



Before discussing the assumptions that underlie the use of the KR20, and
the modification of the formula that might make it possible to minimize the
consequences of the violation of the assumptions, it might be useful to briefly
review the purposes that a test is to serve. Although there may be disagreement
on this point it is the writer's belief that a test score serves but one
purpose: to estimate the testee's present position on a behavioral continuum.
The reasons why it is desired to make this estimate may vary, but will usually
involve some type of comparison. These will always require the comparison of
an entity's present position with either the previous position of the same or
the previous position of another entity. An entity may be either an individual
or a group; in the latter case the mean value of the group would be used.


The purpose of the reliability coefficient is thus quite clear. It allows
the person making the comparison(s) to make a judgement as to how well the
positions on the continuum have been determined: a judgement of particular
importance whenever it becomes necessary to interpret the findings of a study
whose goals were to determine whether there were differences between two or
more estimates of position. A state of affairs such as would exist under the
first condition given in Table 1, and presented in Table 2, would be highly
undesirable: since, under these conditions, only 25 percent of the subjects
would have their actual true score included within the 95 percent confidence
limits about their estimated true score. One can only wonder how many negative
findings, or failures to replicate previous results, may have been caused by
conditions similar to this.
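
The 25 percent figure can be reproduced from the timed statistics of the first sample in Table 1. The sketch below assumes the usual regressed estimate of the true score (mean plus reliability times the deviation of the observed score from the mean) and limits of 1.96 standard errors of measurement on either side of it. These assumptions reproduce the entries of Table 2, but the paper does not state the computation explicitly, and the names used here are illustrative.

```python
import math

def true_score_interval(observed, mean, variance, reliability, z=1.96):
    """Regressed true-score estimate with limits of +/- z standard errors of measurement."""
    estimate = mean + reliability * (observed - mean)
    sem = math.sqrt(variance) * math.sqrt(1 - reliability)
    return estimate, estimate - z * sem, estimate + z * sem

# Timed mean, variance and KR21 for the first sample of Table 1
MEAN, VARIANCE, KR21 = 42.25, 567.19, 0.967
# (observed timed score, true score) for the four testee types
cases = [(81, 90), (36, 90), (36, 40), (16, 40)]

covered = 0
for observed, true in cases:
    estimate, low, high = true_score_interval(observed, MEAN, VARIANCE, KR21)
    if low <= true <= high:
        covered += 1
coverage = covered / len(cases)
```

Only the testee with ability 0.4 and speed 0.9 has a true score inside his limits, so `coverage` is 0.25.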



Insert Table 2 About Here



The Assumption Underlying the Interpretation of the KR20 Reliability Coefficient

The KR20 reliability coefficient can be calculated by any of several different
formulae: two of which are given below.



Insert Formulae 2 & 3 About Here



Simply stated, and without going into a detailed psychometric explanation,
the KR20 reliability coefficient can be considered as an estimate of the squared
correlation between a person's observed test score and his "true" score on the
domain that the test purports to measure. Formula 2 expresses this in terms of
the inter-correlations of the k items that make up the test; while formula 3,
which is mathematically equivalent to 2, uses the total test variance and the
sum of the item variances.
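
The variance form can be illustrated with a short sketch. It assumes the standard textbook expression, k/(k-1) times one minus the ratio of the sum of the item variances (p times q for each item) to the total test variance, with population variances; the formulae themselves are not reproduced in this copy, so the correspondence with formula 3 is an assumption.

```python
def kr20(responses):
    """KR20 from a complete persons-by-items matrix of 0/1 responses."""
    n = len(responses)
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n
    # Population variance of the persons' total scores
    test_variance = sum((t - mean_total) ** 2 for t in totals) / n
    # Item difficulty p and item variance p * (1 - p) for each item
    item_p = [sum(row[j] for row in responses) / n for j in range(k)]
    sum_item_variances = sum(p * (1 - p) for p in item_p)
    return (k / (k - 1)) * (1 - sum_item_variances / test_variance)

# Five persons, four items, every item attempted (hypothetical data)
example = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
reliability = kr20(example)
```

For this small matrix the total test variance is 2.0 and the item variances sum to 0.8, so the coefficient is (4/3)(1 - 0.4) = 0.8.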







Since the formulae are based on population parameters the assumption that must
be met whenever sample estimates are used is that these sample statistics must
be unbiased estimates of their corresponding population parameters. It would
appear that, in the case of the KR20, two conditions must be satisfied if the
assumption is to be met. The first of these is relatively easy to satisfy and
requires only that the subjects be a random sample from the population of
interest. The second condition will be satisfied whenever each person in the
sample has been given sufficient time to allow him to answer every item to which
he knows the answer. To the extent that this is not so the sample statistics
will be biased estimates of their associated parameters, with the precise nature
of the bias being completely unknown. In cases such as this not only would the
calculated value of the KR20 be uninterpretable, but the test scores of those
subjects who had insufficient time to complete the test would be unknown. Table
3, which will also be used to illustrate a modification of the traditional KR20
formula that would adjust for the effects of a restrictive time limit, demon-
strates the indeterminacy of certain of the required test statistics under a
speeded test administration.



Insert Table 3 About Here




There are, in all cases, n plus k pieces of information required for the
calculation of the necessary statistics: the k column sums that will provide
estimates of the item variances, and the n row sums that allow the estimated
test variance to be calculated. In the example 17 of the 30 pieces of
information (four of the row totals and thirteen of the column totals) are
indeterminate. The usual practice is to treat an omitted item as wrong: assuming
that the omission was caused by a lack of knowledge as to the correct answer
and not by a possible failure to reach the item. The item difficulty is thus
defined as the number of correct responses divided by the number of subjects.
This is equivalent to assuming that the speed with which a person can answer
test questions is perfectly correlated with his knowledge of the material to
which the test questions pertain. This assumption is in no way supported by a
rather extensive body of research (Smith, 1971).

It would, however, be possible to estimate the item variances by using only
the available information. The estimated item difficulty would then be the
number of correct responses divided by the number of persons who reached the
item. This would certainly be a sound practice in that the obtained values would
be based only on available information and are unbiased estimates (although some
of them, being based on rather small samples, might have quite wide confidence
limits) of the population values. The same procedure cannot, unfortunately,
often be used to obtain estimates of the row sums: the persons' total test
scores. This is caused by the widespread practice among American test
constructors of arranging the items within a test in order of increasing
difficulty (i.e., the easiest item is placed first, followed by the next
easiest, and so on to the most difficult item). There is thus a confounding of
item difficulty with item placement, and the estimation procedures used with
the item variances would require an assumption that is, by the definition of
the procedure used to determine item placement, impossible: that the probability
of correctly answering an item is independent of item placement.
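
The two definitions of item difficulty discussed above can be compared directly. The sketch below uses a small hypothetical response matrix in which `None` marks an item that the person did not reach; the data and the function name are illustrative and are not taken from Table 3.

```python
def item_difficulties(responses):
    """Item difficulty under both conventions; None marks an unreached item."""
    n = len(responses)
    k = len(responses[0])
    omits_wrong, reached_only = [], []
    for j in range(k):
        column = [row[j] for row in responses]
        correct = sum(1 for r in column if r == 1)
        reached = sum(1 for r in column if r is not None)
        # Traditional practice: unreached items count as wrong
        omits_wrong.append(correct / n)
        # Available-information estimate: divide by the persons who reached the item
        reached_only.append(correct / reached if reached else None)
    return omits_wrong, reached_only

matrix = [
    [1, 1, 0, 1],
    [1, 0, 1, None],
    [1, 1, None, None],
    [0, 1, None, None],
]
traditional, modified = item_difficulties(matrix)
```

Here the last item is answered correctly by the only person who reached it, so its difficulty is 0.25 under the traditional practice and 1.00 when only the available information is used; the second value rests on a single response, which is the wide-confidence-limit caveat raised in the text.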

There are two approaches that may be used to solve the problem. The first
of these makes use of the item inter-correlations formula and uses only the
available data to estimate the correlations. Any correlation program that will
handle missing values could be used to carry out the necessary calculations.
The estimated reliability of the test could then easily be calculated. The
unweighted sum of the item means would serve as the best estimate of the mean
score of the population from which the sample was drawn. This approach appears
to be quite useful for those cases where it is desired to estimate the position
of a group on the behavioral continuum of interest. It would not, however, be
of any value for those instances where it was desired to make statements about
the position of individuals.

The second approach would require unbiased estimates of the probability
that a person would have correctly answered unreached items had he been given
the opportunity to do so. This is, in essence, a sampling problem; and either
of two methods may be used to obtain the required estimates. Both methods
require that the traditional American procedure of arranging items in order of
increasing difficulty (i.e., the easiest item first, followed by the next
easiest item, and so on) no longer be used. In the opinion of the author this
would cause no great harm, since there does not appear to be any meaningful
reason for this type of item arrangement. The first of these methods, which is
the simplest, requires that the items be randomly assigned to their position
within the test. This would allow an estimate to be made of a subject's test
score by using formula 4: which defines that score as the number of correct
responses, divided by the number of items attempted, and multiplied by the
number of items in the test. This method of estimation is hereafter referred to
as condition 2 and its use is illustrated in Table 3.



Insert Formula 4 About Here
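
The verbal definition of the condition 2 score translates directly into code. In the sketch below `None` marks an unreached item; the function name and the two response vectors are hypothetical.

```python
def adjusted_score(responses):
    """Condition 2 score: (number correct / number attempted) * number of items.

    `responses` holds 1, 0 or None (unreached) for each item in the test.
    """
    attempted = [r for r in responses if r is not None]
    if not attempted:
        return None  # nothing was attempted, so no estimate is possible
    return sum(attempted) * len(responses) / len(attempted)

partial = [1, 1, 0, 1, 1, 0, None, None, None, None]   # reached 6 of 10 items
complete = [1, 1, 0, 1, 1, 0, 1, 0, 1, 1]              # reached all 10 items

partial_score = adjusted_score(partial)
complete_score = adjusted_score(complete)
```

The first subject, with 4 correct out of 6 attempted, is credited with 4/6 of the 10 items (about 6.67) rather than 4. For the second subject every item was attempted, and the adjusted score equals the ordinary number-right score of 7.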



Insert Table 3 About Here 



The other approach to item placement, hereafter called condition 3, is
somewhat more complex in that it employs stratified, rather than simple, random
item placement. The procedure is commonly used in England and appears to have
been first proposed by Ellis in 1928. The advantages associated with its use
are the same as when any stratified random sample is used: greater precision of
the resulting estimates. Under this procedure the items are first categorized
into m levels of difficulty. One item from each difficulty level is then
randomly selected and these m test items, arranged in ascending order of
difficulty, are the first m items in the test. The procedure is repeated, in
turn, until all items have been placed in the test. The test will thus consist
of a series of cycles each of which contains m test items. Each cycle is
therefore a representative sample of the entire range of tasks within the domain
being measured by the test. To illustrate how this would work consider a 100
item test in which each item has been assigned to one of five levels of
difficulty: 0.99-0.81, 0.80-0.61, 0.60-0.41, 0.40-0.21, and 0.20-0.01. There
would thus be, in this case, twenty complete cycles and it would be a relatively
simple matter to estimate a subject's performance on any unreached items. The
person by items response matrix that is presented in Table 3 consists of four
cycles of five items.



Using this method a subject's test score can be viewed as consisting of m
separate parts: where m is the number of item difficulty levels (strata) used
in the test. His score, in those cases where all items were not reached, is
thus the sum of the m adjusted level scores. Each of these level scores (formula
5) is calculated by dividing the number of correct responses to the items within
a level by the number of items within that level that were reached, and multi-
plying the result by the total number of items within that level. Two points
should be mentioned at this time. First, it is not necessary that each stratum
contain the same number of items, although the computations required become
easier if this is the case; and, second, both of the methods described above
provide results that are identical with those obtained from the traditional
method of computing test scores, whenever all subjects have been given
sufficient time to attempt each test item. It follows that the various test
statistics will also be the same under such conditions and the traditional
formula for calculating the KR20 can thus be viewed as a special case of a more
general formula that estimates the responses to unreached items by using the
available information of that subject.
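
The stratified score can be sketched in the same way as the simple random case. The example assumes that each item carries a label giving its difficulty level and that `None` marks an unreached item; the function name and data are hypothetical, and formula 5 itself is not reproduced in this copy.

```python
def stratified_score(responses, levels):
    """Condition 3 score: the sum over difficulty levels of
    (correct in level / reached in level) * (items in level).

    `responses` holds 1, 0 or None (unreached); `levels` gives each item's stratum.
    """
    total = 0.0
    for level in sorted(set(levels)):
        in_level = [r for r, lv in zip(responses, levels) if lv == level]
        reached = [r for r in in_level if r is not None]
        if not reached:
            return None  # no information is available for this level
        total += sum(reached) * len(in_level) / len(reached)
    return total

# Two cycles of five items; the i-th item of each cycle belongs to level i
levels = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
partial = [1, 1, 1, 0, 0, 1, 1, None, None, None]   # stopped after 7 items
complete = [1, 1, 1, 0, 0, 1, 1, 0, 1, 0]           # reached every item

partial_score = stratified_score(partial, levels)
complete_score = stratified_score(complete, levels)
```

For the subject who stopped after seven items, the single reached item at each of levels 3 to 5 stands in for the unreached item at the same level, giving a score of 6.0. When every item is reached the level scores reduce to the number right in each level, so the total equals the traditional score, which is the special-case relationship described above.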

As the example that is presented in Table 3 illustrates, the results obtained
by the two modified formulae are in very close agreement: and both differ con-
siderably from those obtained when the omitted responses are treated as wrong
answers. In the example the modified formulae give increased values for the
total test variance, the sum of the item variances, and the reliability of the
test. The standard error of measurement is thus, for the modified formulae,
smaller. The correlations between the three sets of scores, which are also given
in Table 3, are quite interesting.

In order to determine what might happen to the various test statistics in
other circumstances an analysis of forty tests that had been submitted to the
Office of Examination Services was carried out. The various test statistics
were computed for each test by two different methods: treating omits as
incorrect responses (condition 1) and treating omits as unreached items under
the assumption of random item placement (condition 2). The results of this
analysis are given in Table 4 and support the earlier statement that it is
impossible to say, in advance and for any given test, what will happen to the
test statistics.



Insert Table 4 About Here 



Predictive Validity of the Modified Estimates

One final analysis was carried out to determine if there were any
differences in the ability of two test scores, one obtained from the traditional
method and one obtained from the modified method (stratified item placement), to
predict various criteria. The subjects were the 82 first term freshmen enrolled
in a required course in introductory psychology at a large mid-western state
university. Each subject was administered the Verbal and Quantitative sub-tests
of the College Qualification Test. This is a commercial test published by The
Psychological Corporation. The Verbal sub-test consists of 75 four-choice
verbal analogies while the Quantitative sub-test consists of 50 four-choice
mathematical questions. Although the tests are stated to be untimed there is
a recommended time limit and this was used. Both of the tests had, on the basis
of item difficulty data previously supplied by the publisher, been modified into
condition 3 (stratified item placement). Each cycle contained five test items.
Each subject thus had six different scores: the CQT-V and CQT-Q computed using
the traditional (condition 1) formula; the CQT-V and CQT-Q computed using the
stratified (condition 3) formula; and two estimates of subject speed. These
last two scores were calculated by dividing the number of items that had been
reached by the number of items in the test: and the possible values range from
0.00 to 1.00. These were used as crude estimates of the rate at which a subject
could perform the tasks sampled by the test items and may be considered, in a
loose sense, as the speed with which a person can handle new information. These
six scores, along with the sex of each subject, were then used as predictor
variables for three different academic criteria: first term grade point average
in non-mathematics/science courses (GPA1), first term grade point average in
mathematics/science courses (GPA2), and first term grade point average in all
courses (GPAT). The results of this series of analyses are given in Table 5.
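
The speed estimate described above is simple to compute. The sketch below assumes that the number of items reached is the position of the last item to which the subject gave any response, so that an item skipped before that point still counts as reached; the paper does not spell out this detail, and the function name is illustrative.

```python
def speed_index(responses):
    """Proportion of the test reached: position of the last answered item / test length.

    `responses` holds 1, 0 or None (no response) for each item, in test order.
    """
    answered = [i for i, r in enumerate(responses) if r is not None]
    last_reached = max(answered) + 1 if answered else 0
    return last_reached / len(responses)

# A subject who skipped item 3 and stopped after item 4 of a six-item test
index = speed_index([1, 0, None, 1, None, None])
```

The value ranges from 0.00 for a subject who answered nothing to 1.00 for a subject who responded to the final item.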



Insert Table 5 About Here 



Sex was not a significant predictor for any of the three criteria. As
would be expected, the verbal test scores were the best single predictors of
GPA1 and the quantitative test scores were the best single predictors of GPA2.
The multiple correlations were, in all cases, larger when the modified scores
were used as predictors. It was quite interesting to note that both of the
speed indices were significant predictors of GPA2. This could be interpreted as
meaning that, for the type of material covered in these courses, the rate of
response (taken as an indication of the rate of learning acquisition) is an
important factor. The two sets of test statistics are similar. This is
to be expected if, as is the case with the CQT, the time allowed is close to
being sufficient for all subjects to attempt all items. This analysis gives
tentative support to the contention that more accurate, and therefore more
useful, information is provided by the modified formulae.

Recommendations

Based upon the series of analyses herein reported it would appear that at
least six recommendations are in order.

1) All commercial test publishers should routinely provide information as
to the degree of speededness associated with their tests. In the case of multi-
level tests (i.e., those tests that are used for several different age groups),
especially those for use in the earlier grades, the information should be pro-
vided for each level at which the test may be used. Should there be reason to
believe that sub-divisions of the population differ in their response rates
then this information should also be provided for the sub-divisions.

2) Research as to which of the various indices is most accurate should
be carried out. In the interim any of the single administration indices refer-
enced in this paper could be used to provide the needed information.

3) Firms providing test analysis and reporting services should routinely
calculate, and provide as part of their services, the speed index of each group
administration of a test. This is a trivial problem of computer programming
and requires a sub-routine of less than twenty statements.

4) Careful consideration should be given as to whether the present procedure
used to determine item location within a test should be changed. There is very
little practical or theoretical reason to retain the present procedure, whereas
there are several benefits that would follow the adoption of either a simple
random, or a cyclical, arrangement of items.

5) Should the above be adopted the firms mentioned in 3) should also
provide, if requested, test statistics, including the individual test scores,
based on the appropriate modified formula. Although the programming required to
provide this service is less trivial than was the case with the speed index, it
is still a very simple matter.

6) A program of research aimed at discovering whether the rate of response
is indeed a separate measurable dimension should be initiated. Should this prove
to be the case, and should response rate be related to the rate of intellectual
development (and the existing research in this area indicates that this may well
be the case), the implications for education are self-evident.



Summary

Several indices that have been proposed as estimates of the degree of
speededness of a test were discussed. With one exception, Cronbach's tau, all
of these appear to be based on an unsupportable rationale: that the effect of
a restrictive time limit will be to, in all cases, increase the total test
variance. The speeded administration of a test was shown to result in test
results that are basically uninterpretable: with the problem being caused by
the bias that is introduced into the various test statistics as a result of the
insufficient time limit. It was further demonstrated that the traditional KR20
formula is a special case, requiring the assumption that all subjects have been
allowed sufficient time to attempt all of the test items, of a more general
formula that does not require this restrictive assumption.

An empirical study indicates that the scores provided by the modified
formula were slightly better predictors of first term grade point average than
were those scores provided by the traditional formula.

Six recommendations, especially applicable to firms publishing tests or
providing test analysis services, were given.



REFERENCES 



Cronbach, L. J. & Warrington, W. G. Time-limit tests: Estimating their
    reliability and degree of speeding. Psychometrika, 1951, 16, 167-188.

Ellis, R. S. A method of constructing and scoring tests with time limits
    to eliminate or weigh the effect of speed. School and Society,
    1928, 28, 205-207.

Gulliksen, H. The reliability of speeded tests. Psychometrika, 1950,
    15, 259-269.

Helmstadter, G. C. & Ortmeyer, D. H. Some techniques for determining the
    relative magnitude of speed and power components of a test. Educational
    and Psychological Measurement, 1953, 13, 280-287.

Mollenkopf, W. G. Time limits and the behavior of test takers. Educational
    and Psychological Measurement, 1960, 20, 223-230.

Morrison, E. J. On test variance and the dimensions of the test taking
    situation. Educational and Psychological Measurement, 1960, 20,
    231-250.

Nunnally, J. C. Psychometric Theory. McGraw-Hill, New York, 1967.

Smith, D. M. The validity of factor score estimates of speed and accuracy
    as predictors of first term grade point average. (Doctoral dissertation,
    Florida State University) Ann Arbor, Michigan: University Microfilms,
    1971.

Toops, H. A. A comparison, by work-limit and time-limit, of item-analysis
    indices for practical test construction. Educational and Psychological
    Measurement, 1960, 20, 251-266.

Wesman, A. G. Some effects of speed in test use. Educational and
    Psychological Measurement, 1960, 20, 267-274.



Table 1

Expected Values of Test Statistics Under Timed and Untimed Conditions
for Groups of Differing Characteristics

        Proportion of Sample
      A=.9  A=.9  A=.4  A=.4    Timed Administration     Untimed Administration
No.   S=.9  S=.4  S=.9  S=.4    Mean   Variance  KR21     Mean   Variance  KR21

 1    0.25  0.25  0.25  0.25    42.25   567.19   .967     65.00   625.00   .973
 2    0.50  0.00  0.50  0.00    58.50   506.25   .962     65.00   625.00   .973
 3    0.00  0.50  0.00  0.50    26.00   100.00   .816     65.00   625.00   .973
 4    0.50  0.50  0.00  0.00    58.50   506.25   .962     90.00     0.00    I
 5    0.00  0.00  0.50  0.50    26.00   100.00   .816     40.00     0.00    I
 6    0.50  0.00  0.00  0.50    48.50  1056.25            65.00   625.00   .973
 7    0.00  0.50  0.50  0.00    36.00     0.00    I       65.00   625.00   .973
 8    1.00  0.00  0.00  0.00    81.00     0.00    I       90.00     0.00    I
 9    0.00  0.00  1.00  0.00    36.00     0.00    I       40.00     0.00    I
10    0.00  1.00  0.00  0.00    36.00     0.00    I       90.00     0.00    I
11    0.00  0.00  0.00  1.00    16.00     0.00    I       40.00     0.00    I
12    0.70  0.00  0.30  0.00    67.50   425.25   .958     75.00   525.00   .974
13    0.30  0.00  0.70  0.00    49.50   425.25   .951     55.00   525.00   .962
14    0.70  0.30  0.00  0.00    67.50   425.25   .958     90.00     0.00    I
15    0.30  0.70  0.00  0.00    49.50   425.25   .951     90.00     0.00    I
16    0.00  0.00  0.70  0.30    30.00    84.00   .758     40.00     0.00    I
17    0.00  0.00  0.30  0.70    22.00    84.00   .812     40.00     0.00    I
18    0.00  0.30  0.00  0.70    22.00    84.00   .812     55.00   525.00   .962
19    0.00  0.70  0.00  0.30    30.00    84.00   .758     75.00   525.00   .974
20    0.70  0.00  0.00  0.30    61.50   851.41   .982     75.00   525.00   .974
21    0.30  0.00  0.00  0.70    35.50   887.25   .984     55.00   525.00   .962
22    0.00  0.30  0.70  0.00    36.00     0.00    I       55.00   525.00   .962
23    0.00  0.70  0.30  0.00    36.00     0.00    I       75.00   525.00   .974

Correlation between timed and untimed values:  Mean .6825   Variance .1890
Table 2

True Scores, Estimated True Scores and Confidence Limits for Sample Number 1

                    Observed    Estimated      95 Percent           True
Characteristics     Score       True Score     Confidence Limits    Score

A=.9, S=.9            81          79.72         71.24 - 88.20         90
A=.9, S=.4            36          36.21         27.73 - 44.69         90
A=.4, S=.9            36          36.21         27.73 - 44.69         40
A=.4, S=.4            16          16.87          8.39 - 25.35         40






Table 3

[Person by items response matrix, four cycles of five items. The table is not
legible in this reproduction.]



Table 4

Comparison of Test Statistics Obtained from the Analysis of Forty Tests
Using the Traditional KR20 Formula and the Simple Random Item Placement Modification

                   Sum of the         KR20            Number of
Test Variance      Item Variance      Reliability     Occurrences

Increase           Increase           Increase             3
Increase           Increase           Decrease             5
Increase           Decrease           Increase             6
Increase           Decrease           Decrease             0
Decrease           Increase           Increase             0
Decrease           Increase           Decrease            15
Decrease           Decrease           Increase             6
Decrease           Decrease           Decrease             5

Note: The changes given are those of the modified formula,
referenced to the traditional formula.
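The three statistics compared in Table 4 can be computed from a person-by-item score matrix. The sketch below shows the traditional KR20 computation with omitted items scored as incorrect (0). It uses population variances (divisor N), which is an assumption, since the divisor is not stated here. The function names and the tiny data set are illustrative only.

```python
def population_variance(values):
    """Variance with divisor N (assumed; the divisor is not stated)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def kr20(responses):
    """KR20 from rows of 0/1 item scores; omitted items must be coded 0.

    Returns (test variance, sum of the item variances, KR20).
    """
    n_items = len(responses[0])
    if n_items < 2 or any(len(row) != n_items for row in responses):
        raise ValueError("need a rectangular matrix with at least two items")
    totals = [sum(row) for row in responses]
    test_var = population_variance(totals)
    if test_var == 0:
        raise ValueError("KR20 is undefined when the test variance is zero")
    item_var_sum = sum(
        population_variance([row[j] for row in responses]) for j in range(n_items)
    )
    coefficient = n_items / (n_items - 1) * (test_var - item_var_sum) / test_var
    return test_var, item_var_sum, coefficient


# Four hypothetical examinees, three items.
data = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
test_var, item_var_sum, coefficient = kr20(data)
print(test_var, item_var_sum, coefficient)
```

Running the same function on the traditional scores and on scores adjusted for unreached items gives the pairs of values whose direction of change Table 4 tallies.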






Table 5 



Results of the Predictive Study
Including Variable Means and Standard Deviations
and Test Statistics



Statistic Vari^le 







CQTv 


0C7TQ 


KCQT7 


. KCQTQ 


GFiu. 


GrA2 




• 


Kesa 








48.901 


24.353 


2 620 


2 


2 607 




Stsi De? 






13.510 


7.835 








• 


Strn item var 


17.791 


11.254 


16.381 


11.697 








S3-20*» 




0.SC3 


0.835 


0.923 


0.326 










SE Heas 




2. see 


3.197 


3,760 


3.265 

• 






• 












Inter-correlations

              Verbal   Math     Total                      Verb                       Math
              GPA      GPA      GPA      CQTV     MCQTV    Speed    CQTQ     MCQTQ    Speed

Verbal GPA    1.00     0.63     0.65     0.59     0.57     0.45     0.29     0.33     0.27
Math GPA      0.63     1.00     0.69     0.40     0.31     0.30     0.52     0.61     0.30
Total GPA     0.65     0.69     1.00     0.58     0.57     0.33     0.28     0.33     0.27
CQTV          0.59     0.40     0.58     1.00     0.85     0.50     0.34     0.31     0.26
MCQTV         0.57     0.31     0.57     0.85     1.00     0.56     0.19     0.23     0.21
Verb Speed    0.45     0.30     0.33     0.50     0.56     1.00     0.17     0.21     0.20
CQTQ          0.29     0.52     0.28     0.34     0.19     0.17     1.00     0.92     0.70
MCQTQ         0.33     0.61     0.33     0.31     0.23     0.21     0.92     1.00     0.71
Math Speed    0.27     0.30     0.27     0.26     0.21     0.20     0.70     0.71     1.00



Regression Coefficients

                                        Criterion

              Traditional (Cond 1) Scores        Modified (Cond 3) Scores

              Verbal     Math       Total        Verbal     Math       Total
Predictor     GPA        GPA        GPA          GPA        GPA        GPA

Intercept     1.564      1.184      1.645        0.935      1.063      1.050
             (0.059)    (0.066)    (0.053)      (0.059)    (0.060)    (0.054)

CQTV          0.030      0.014      0.027
             (0.005)    (0.006)    (0.004)

MCQTV                                            0.025                 0.024
                                                (0.005)               (0.004)

Verb Speed                                                  0.513
                                                           (0.221)

CQTQ                     0.040
                        (0.00?)

MCQTQ                                            0.018      0.073      0.016
                                                (0.008)    (0.011)    (0.007)

Math Speed                                                 -0.900
                                                           (0.373)

R             0.589      0.573      0.579        0.603      0.671      0.603

R squared     0.346      0.328      0.336        0.364      0.450      0.364

Note: Standard errors are in parentheses beneath the associated coefficient.
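The multiple correlations in Table 5 can be checked against the inter-correlations in the same table. The sketch below uses the standard two-predictor formula for R squared. It takes as an assumption that the traditional Math GPA equation uses CQTV and CQTQ as its two predictors, and it reads the three correlations (0.40, 0.52 and 0.34) from the inter-correlation matrix. Because those inputs are rounded to two decimals, agreement is expected only to about the second decimal place. The function name is illustrative.

```python
import math


def multiple_r_two_predictors(r_y1, r_y2, r_12):
    """Multiple R and R squared for a criterion regressed on two predictors.

    r_y1, r_y2: correlations of the criterion with each predictor
    r_12:       correlation between the two predictors
    """
    if abs(r_12) >= 1.0:
        raise ValueError("predictors must not be perfectly correlated")
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return math.sqrt(r_squared), r_squared


# Math GPA with CQTV (0.40), Math GPA with CQTQ (0.52), CQTV with CQTQ (0.34).
r, r_squared = multiple_r_two_predictors(0.40, 0.52, 0.34)
print(round(r, 3), round(r_squared, 3))  # compare with the tabled 0.573 and 0.328
```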






Formulae



^14^23 




'23 



I 



is the proportion of the total test variance
that may be attributed to speed

is the correlation between Form A given under
timed conditions and Form B given under
untimed conditions

is the correlation between Form A given under
untimed conditions and Form B given under
timed conditions

is the correlation between Forms A & B when
both are given under timed conditions

is the correlation between Forms A & B when
both are given under untimed conditions



1^ 



and j = 1,K; k = 1,K

K is the number of items in the test

is the correlation between the jth and the kth
test items, with the values calculated under
the assumption that all omitted items are
incorrect responses



$$\mathrm{KR20} = \frac{K}{K-1} \cdot \frac{V(X) - \sum_{k=1}^{K} V(k)}{V(X)}$$

Where: N     is the number of persons who took the test
       K     is the number of items in the test
       X     is the total test score of a person on the
             test. This is the number of test items that
             were correctly answered, and all omitted items
             are counted as incorrect responses
       V(X)  is the variance of the N total test scores
       V(k)  is the variance of the kth test item,
             and k = 1,K.

$$XC2_n = K \, \frac{C_n}{A_n} \quad \text{and } n = 1,N$$

Where: XC2_n is the score of the nth person who took the
             test, adjusted under the assumption of simple
             random item placement
       N     is the number of persons who took the test
       n     is the nth person who took the test
       K     is the number of items in the test
       k     is the kth item in the test
       C_n   is the number of items correctly answered
             by the nth person
       A_n   is the number of items reached (attempted)
             by the nth person



(5) $$XC3_n = \sum_{m=1}^{M} K_m \, \frac{C_{nm}}{A_{nm}} \quad \text{and } n = 1,N;\ m = 1,M$$

Where: XC3_n is the score of the nth person who took the
             test, adjusted under the assumption of
             stratified random item placement
       N     is the number of persons who took the test
       n     is the nth person who took the test
       M     is the number of item difficulty levels
             (strata) in the test
       m     is the mth item difficulty level of the test
       K_m   is the number of test items in the mth item
             difficulty level
       C_nm  is the number of items in the mth difficulty
             level that the nth person correctly answered
       A_nm  is the number of items in the mth difficulty
             level that the nth person reached (attempted)






